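We introduce a new dataset for conversational question answering over knowledge graphs (KGs) with verbalized answers. Question answering over KGs currently focuses on answer generation for single-turn questions (KGQA) or multi-turn conversational question answering (ConvQA). However, in real-world scenarios (e.g., voice assistants such as Siri, Alexa, and Google Assistant), users prefer verbalized answers. This paper contributes to the state of the art by extending an existing ConvQA dataset with multiple paraphrased verbalized answers. We run experiments with five sequence-to-sequence models on generating answer responses while maintaining grammatical correctness. We additionally perform an error analysis that details the rates of the models' mispredictions in specified categories. Our proposed dataset, extended with answer verbalizations, is publicly available with detailed documentation on its usage for wider utility.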
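Knowledge graphs, such as Wikidata, comprise structural and textual knowledge in order to represent knowledge. For each of the two modalities, dedicated approaches (graph embeddings and language models, respectively) learn patterns that allow for predicting novel structural knowledge. Few approaches have integrated learning and inference across both modalities, and those that exist can only partially exploit the interaction of structural and textual knowledge. In our approach, we build on existing strong representations of single modalities and use hypercomplex algebra to represent both (i) single-modality embeddings and (ii) the interaction between different modalities and their complementary means of knowledge representation. More specifically, we propose Dihedron and Quaternion representations of 4D hypercomplex numbers to integrate four modalities, namely structural knowledge graph embeddings, word-level representations (e.g., word2vec, fastText), sentence-level representations (Sentence Transformer), and document-level representations (Sentence Transformer, doc2vec). Our unified vector representation scores the plausibility of labelled edges via Hamilton and Dihedron products, thus modeling pairwise interactions between different modalities. Extensive experimental evaluation on standard benchmark datasets shows the superiority of our two new models, which use abundant textual information besides sparse structural knowledge to enhance performance in link prediction tasks.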
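To make the scoring idea concrete, here is a minimal sketch of a quaternion-based plausibility score built on the Hamilton product. It assumes a QuatE-style form (rotate the head by the unit-normalized relation, then take the inner product with the tail); the abstract does not give the exact scoring function, so the function names, the normalization step, and the reading of the four components as the four modalities are illustrative assumptions.

```python
import math
from typing import List, Tuple

# One quaternion = four real components. In this sketch the four slots are
# assumed to hold the structural, word-, sentence- and document-level values.
Quaternion = Tuple[float, float, float, float]


def hamilton_product(p: Quaternion, q: Quaternion) -> Quaternion:
    """Standard (non-commutative) Hamilton product of two quaternions."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def normalize(q: Quaternion) -> Quaternion:
    """Scale a quaternion to unit norm so that it acts as a pure rotation."""
    norm = math.sqrt(sum(x * x for x in q))
    return (q[0] / norm, q[1] / norm, q[2] / norm, q[3] / norm)


def score(head: List[Quaternion], rel: List[Quaternion], tail: List[Quaternion]) -> float:
    """Assumed scoring form: sum over dimensions of <head (x) unit(rel), tail>."""
    total = 0.0
    for h, r, t in zip(head, rel, tail):
        rotated = hamilton_product(h, normalize(r))
        total += sum(x * y for x, y in zip(rotated, t))
    return total
```

Because every output component of the Hamilton product mixes all four input components, a score of this form couples the modalities pairwise, which is the property the abstract highlights.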
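Knowledge graph embedding models have become an important area of machine learning. These models provide latent representations of the entities and relations in a knowledge graph, which can then be used in downstream machine learning tasks such as link prediction. Such models can be trained by contrasting positive and negative triples. While all triples of a KG are considered positive, negative triples are usually not readily available. The choice of sampling method for obtaining negative triples therefore plays a crucial role in the performance and effectiveness of knowledge graph embedding models. Most current methods draw negative samples from a random distribution over the entities of the underlying knowledge graph, which often yields meaningless triples. Other known methods use adversarial techniques or generative neural networks, which reduces the efficiency of the process. In this paper, we propose an approach for generating informative negative samples that takes available complementary knowledge about entities into account. In particular, pre-trained language models are used to obtain representations of symbolic entities from their textual information, and neighborhood clusters are formed based on the distances between these entity representations. Our comprehensive evaluation demonstrates the effectiveness of the proposed approach on benchmark knowledge graphs with textual information for the link prediction task.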
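The neighborhood-based sampling idea can be sketched as follows. This is a toy illustration, not the authors' implementation: the two-dimensional vectors stand in for language-model embeddings of entity descriptions, and the function names, the Euclidean distance, and the choice to corrupt only the tail are all assumptions.

```python
import math
import random
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def nearest_neighbors(entity: str, vectors: Dict[str, List[float]], k: int) -> List[str]:
    """Return the k entities whose vectors are closest to `entity` (Euclidean)."""
    anchor = vectors[entity]
    others = [e for e in vectors if e != entity]
    others.sort(key=lambda e: math.dist(anchor, vectors[e]))
    return others[:k]


def sample_negative(
    triple: Triple,
    vectors: Dict[str, List[float]],
    positives: Set[Triple],
    k: int = 2,
    rng: Optional[random.Random] = None,
) -> Optional[Triple]:
    """Corrupt the tail with a textually similar entity, skipping known positives."""
    rng = rng or random.Random(0)
    head, rel, tail = triple
    candidates = [
        e for e in nearest_neighbors(tail, vectors, k)
        if (head, rel, e) not in positives
    ]
    if not candidates:
        return None  # no informative negative available in this neighborhood
    return (head, rel, rng.choice(candidates))


# Hypothetical stand-ins for embeddings of entity descriptions.
toy_vectors = {
    "Berlin": [0.9, 0.1],
    "Munich": [0.8, 0.2],
    "Paris": [0.7, 0.3],
    "banana": [-1.0, 0.5],
}
toy_positives = {("Germany", "capital", "Berlin")}
negative = sample_negative(("Germany", "capital", "Berlin"), toy_vectors, toy_positives)
```

With these toy vectors the corrupted tail is always another city rather than "banana", which is the sense in which such negatives are more informative than uniformly random ones.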
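Each year the International Semantic Web Conference (ISWC) organizes a set of Semantic Web Challenges to establish competitions that advance state-of-the-art solutions in selected problem domains. The Semantic Answer Type and Relation Prediction (SMART) task is one of the ISWC 2021 Semantic Web Challenges. This is the second year of the challenge, following a successful SMART 2020 at ISWC 2020. This year's edition focuses on two sub-tasks that are very important to Knowledge Base Question Answering (KBQA): answer type prediction and relation prediction. Question type and answer type prediction can play a key role in KBQA systems by providing insights about the expected answer that help to generate correct queries or rank answer candidates. More specifically, given a question in natural language, the first task is to predict the answer type using a target ontology (e.g., DBpedia or Wikidata). Similarly, the second task is to identify relations in the natural language query and link them to the relations in a target ontology. This paper discusses the task descriptions, benchmark datasets, and evaluation metrics. For more information, please visit https://smart-task.github.io/2021/.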
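Wikidata is a frequently updated, community-driven, and multilingual knowledge graph. Hence, Wikidata is an attractive basis for entity linking, as is evident from the recent increase in published papers. This survey focuses on four questions: (1) Which Wikidata entity linking datasets exist, how widely are they used, and how are they constructed? (2) Do the characteristics of Wikidata matter for the design of entity linking datasets, and if so, how? (3) How do current entity linking approaches exploit the specific characteristics of Wikidata? (4) Which Wikidata characteristics are left unexploited by existing entity linking approaches? This survey reveals that current Wikidata-specific entity linking datasets do not differ in their annotation scheme from the schemes used for other knowledge graphs. Thus, the potential for multilingual and time-dependent datasets, which are naturally suited to Wikidata, remains unrealized. Moreover, we show that most entity linking approaches use Wikidata in the same way as any other knowledge graph, missing the chance to leverage Wikidata-specific characteristics to increase quality. Almost all approaches employ specific properties such as labels and sometimes descriptions, but ignore characteristics such as the hyper-relational structure. Hence, there is still room for improvement, for example by including hyper-relational graph embeddings or type information. Many approaches also include information from Wikipedia, which is easily combined with Wikidata and provides valuable textual information that Wikidata lacks.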
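The heterogeneity in the implementation, training, and evaluation of recently published knowledge graph embedding models has made fair and thorough comparisons difficult. To assess the reproducibility of previously published results, we re-implemented and evaluated 21 interaction models in the PyKEEN software package. Here, we outline which results could be reproduced with their reported hyperparameters, which could only be reproduced with alternate hyperparameters, and which could not be reproduced at all, and we provide insight into why this might be the case. We then performed a large-scale benchmark on four datasets, comprising several thousand experiments and 24,804 GPU hours of computation time. We present the insights gained on best practices, the best configuration for each model, and where improvements can be made over previously published best configurations. Our results highlight that the combination of model architecture, training approach, loss function, and explicit modeling of inverse relations is crucial for a model's performance, which is not determined by the model architecture alone. We provide evidence that several architectures can obtain results competitive with the state of the art when configured carefully. We have made all code, experimental configurations, results, and analyses that lead to our interpretations available at https://github.com/pykeen/pykeen and https://github.com/pykeen/benchmarking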
Variational inference uses optimization, rather than integration, to approximate the marginal likelihood, and thereby the posterior, in a Bayesian model. Thanks to advances in computational scalability made in the last decade, variational inference is now the preferred choice for many high-dimensional models and large datasets. This tutorial introduces variational inference from the parametric perspective that dominates these recent developments, in contrast to the mean-field perspective commonly found in other introductory texts.
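The link between the marginal likelihood and the posterior mentioned above can be written out in one line. The notation is assumed here rather than taken from the tutorial: $x$ denotes the observed data, $z$ the latent variables, and $q_\phi$ a variational family with parameters $\phi$.

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q_\phi(z)}\bigl[\log p(x, z) - \log q_\phi(z)\bigr]}_{\mathrm{ELBO}(\phi)}
  + \mathrm{KL}\bigl(q_\phi(z) \,\|\, p(z \mid x)\bigr)
```

Since the KL term is non-negative, $\mathrm{ELBO}(\phi) \le \log p(x)$. The left-hand side does not depend on $\phi$, so maximizing the ELBO over $\phi$ both tightens the bound on the marginal likelihood and drives $q_\phi$ toward the posterior.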
The release of ChatGPT, a language model capable of generating text that appears human-like and authentic, has gained significant attention beyond the research community. We expect that the convincing performance of ChatGPT incentivizes users to apply it to a variety of downstream tasks, including prompting the model to simplify their own medical reports. To investigate this phenomenon, we conducted an exploratory case study. In a questionnaire, we asked 15 radiologists to assess the quality of radiology reports simplified by ChatGPT. Most radiologists agreed that the simplified reports were factually correct, complete, and not potentially harmful to the patient. Nevertheless, instances of incorrect statements, missed key medical findings, and potentially harmful passages were reported. While further studies are needed, the initial insights of this study indicate a great potential in using large language models like ChatGPT to improve patient-centered care in radiology and other medical domains.
Artificial Intelligence (AI) has become commonplace for solving routine everyday tasks. Because of the exponential growth in the volume and complexity of medical imaging data, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to widen, creating demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed in research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barriers to clinical deployment, we have formed the MONAI Consortium, an open-source community that is building standards for AI deployment in healthcare institutions and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem-solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI model development in research labs and subsequent clinical deployment, and we propose solutions. Our report provides guidance on the processes that take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical radiology workflow and present a taxonomy of radiology AI use cases. Through this report, we intend to educate stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.
Over the years, sequential Monte Carlo (SMC) and, equivalently, particle filter (PF) theory has gained substantial attention from researchers. However, the performance of the resampling methodology, also known as offspring selection, has not advanced recently. We propose two deterministic offspring selection methods, which strive to minimize the Kullback-Leibler (KL) divergence and the total variation (TV) distance, respectively, between the particle distributions before and after offspring selection. By reducing the statistical distance between the selected offspring and the joint distribution, we obtain a heuristic search procedure that outperforms a maximum likelihood search in precisely those contexts where the latter performs better than an SMC. For SMC and particle Markov chain Monte Carlo (pMCMC), our proposed offspring selection methods always outperform or compare favorably with the two state-of-the-art resampling schemes on two models commonly used as benchmarks in the literature.
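As a rough illustration of what deterministic offspring selection can look like, the sketch below allocates offspring counts by largest-remainder rounding and reports the TV distance between the original particle weights and the resulting offspring distribution. This is not necessarily either of the two methods proposed in the paper; the function names and the rounding rule are assumptions chosen because they keep each count within one of its ideal value N·w_i without any random draws.

```python
import math
from typing import List


def largest_remainder_offspring(weights: List[float], n_offspring: int) -> List[int]:
    """Deterministically assign n_offspring copies across particles.

    Each particle first receives floor(N * w_i) offspring; the remaining
    slots go to the particles with the largest fractional remainders.
    """
    total = sum(weights)
    quotas = [n_offspring * w / total for w in weights]
    counts = [math.floor(q) for q in quotas]
    leftover = n_offspring - sum(counts)
    by_remainder = sorted(
        range(len(weights)),
        key=lambda i: quotas[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


def total_variation(weights: List[float], counts: List[int]) -> float:
    """TV distance between normalized weights and normalized offspring counts."""
    total_w = sum(weights)
    total_c = sum(counts)
    return 0.5 * sum(abs(w / total_w - c / total_c) for w, c in zip(weights, counts))


example_counts = largest_remainder_offspring([0.5, 0.3, 0.2], 4)
example_tv = total_variation([0.5, 0.3, 0.2], example_counts)
```

Unlike multinomial or systematic resampling, the allocation here is a fixed function of the weights, so repeated runs on the same particle set always yield the same offspring.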